Although weakly-supervised techniques can reduce the labeling effort, it is unclear whether a saliency model trained with weakly-supervised data (e.g., point annotations) can match the performance of its fully-supervised version. This paper attempts to answer this unexplored question by proving a hypothesis: there exists a point-labeled dataset on which saliency models can be trained to achieve performance equivalent to training on the densely annotated dataset. To prove this conjecture, we propose a novel yet effective adversarial trajectory-ensemble active learning (ATAL) method. Our contributions are three-fold: 1) Our proposed adversarial attack triggering uncertainty overcomes the overconfidence of existing active learning methods and accurately locates uncertain pixels. 2) Our proposed trajectory-ensemble uncertainty estimation method maintains the advantages of ensemble networks while significantly reducing the computational cost. 3) Our proposed relationship-aware diversity sampling algorithm overcomes oversampling while boosting performance. Experimental results show that ATAL can find such a point-labeled dataset: a saliency model trained on it attains $97\%$--$99\%$ of the performance of its fully-supervised version with only ten annotated points per image.
translated by Google Translate
Nesterov's accelerated gradient descent (NAG) is one of the milestones in the history of first-order algorithms. The mechanism behind the acceleration phenomenon, namely the gradient correction term, was not uncovered until the high-resolution differential equation framework was proposed in [Shi et al., 2022]. To deepen our understanding of how the high-resolution differential equation framework bears on the convergence rate, in this paper we continue to investigate NAG for $\mu$-strongly convex functions, based on the techniques of Lyapunov analysis and phase-space representation. First, we revisit the proof from the gradient-correction scheme. Similar to [Chen et al., 2022], a straightforward calculation greatly simplifies the proof and, with minor modification, enlarges the step size to $s=1/L$. Meanwhile, the way of constructing Lyapunov functions is principled. Furthermore, we also investigate NAG from the implicit-velocity scheme. Due to the difference in the velocity iterates, we find that the Lyapunov function constructed from the implicit-velocity scheme needs no additional term and that the calculation of the iterative difference becomes simpler. Together with the optimal step size obtained, the high-resolution differential equation framework from the implicit-velocity scheme of NAG is perfect and outperforms the gradient-correction scheme.
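The abstract does not write out the NAG iteration itself. As a minimal sketch, assuming the standard form for $\mu$-strongly convex functions (a gradient step followed by momentum with coefficient $(1-\sqrt{\mu s})/(1+\sqrt{\mu s})$), assuming $L$ denotes the Lipschitz constant of the gradient, and using the step size $s=1/L$ mentioned above, the scheme can be compared with plain gradient descent on a toy problem. The function name `nag_strongly_convex` and the two-dimensional quadratic are hypothetical and chosen only for illustration.

```python
import math


def nag_strongly_convex(grad, x0, mu, L, num_iters):
    """Standard NAG for a mu-strongly convex, L-smooth function (assumed form).

    y_{k+1} = x_k - s * grad(x_k)
    x_{k+1} = y_{k+1} + beta * (y_{k+1} - y_k),  beta = (1 - sqrt(mu*s)) / (1 + sqrt(mu*s))
    """
    s = 1.0 / L  # step size s = 1/L, as in the abstract
    beta = (1 - math.sqrt(mu * s)) / (1 + math.sqrt(mu * s))
    x = list(x0)
    y_prev = list(x0)  # y_0 = x_0
    for _ in range(num_iters):
        g = grad(x)
        y = [xi - s * gi for xi, gi in zip(x, g)]                    # gradient step
        x = [yi + beta * (yi - ypi) for yi, ypi in zip(y, y_prev)]   # momentum step
        y_prev = y
    return y_prev


def gradient_descent(grad, x0, L, num_iters):
    """Plain gradient descent with the same step size, for comparison."""
    s = 1.0 / L
    x = list(x0)
    for _ in range(num_iters):
        x = [xi - s * gi for xi, gi in zip(x, grad(x))]
    return x


# Hypothetical test problem: f(x) = 0.5 * (mu * x1^2 + L * x2^2).
MU, BIG_L = 1.0, 100.0


def f(x):
    return 0.5 * (MU * x[0] ** 2 + BIG_L * x[1] ** 2)


def grad_f(x):
    return [MU * x[0], BIG_L * x[1]]


f_nag = f(nag_strongly_convex(grad_f, [1.0, 1.0], MU, BIG_L, 200))
f_gd = f(gradient_descent(grad_f, [1.0, 1.0], BIG_L, 200))
print(f"NAG: {f_nag:.3e}   GD: {f_gd:.3e}")
```

On this ill-conditioned quadratic the momentum term is what separates the two runs: both use the same step size, yet the NAG iterate ends much closer to the minimizer after the same number of gradient evaluations.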
The hyperparameter optimization of neural networks can be expressed as a bilevel optimization problem. Bilevel optimization is used to update the hyperparameters automatically, and the gradient of the hyperparameters is an approximate gradient based on the best-response function. Finding the best-response function is very time-consuming. In this paper we propose CPMLHO, a new hyperparameter optimization method that uses the cutting plane method and a mixed-level objective function. The cutting plane is added to the inner level to constrain the space of the response function. To obtain a more accurate hypergradient, the mixed-level objective flexibly adjusts the loss function by using the losses on both the training set and the validation set. Experimental results show that, compared to existing methods, our method can automatically update the hyperparameters during training and can find better hyperparameters with higher accuracy and faster convergence.
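For reference, the generic bilevel formulation that this line of work starts from can be written as follows. This is a sketch of the standard setup, not of CPMLHO's cutting-plane constraint or mixed-level objective, whose exact forms the abstract does not give. The symbols are assumptions: $\lambda$ denotes the hyperparameters, $w$ the network weights, $w^{*}(\lambda)$ the best-response function, and $\mathcal{L}_{\mathrm{train}}$, $\mathcal{L}_{\mathrm{val}}$ the losses on the training set and on the held-out validation (verification) set.

```latex
\begin{aligned}
  \min_{\lambda}\quad & \mathcal{L}_{\mathrm{val}}\bigl(w^{*}(\lambda), \lambda\bigr) \\
  \text{s.t.}\quad & w^{*}(\lambda) \in \arg\min_{w}\; \mathcal{L}_{\mathrm{train}}(w, \lambda),
\end{aligned}
\qquad
\frac{\mathrm{d}\mathcal{L}_{\mathrm{val}}}{\mathrm{d}\lambda}
  = \frac{\partial \mathcal{L}_{\mathrm{val}}}{\partial \lambda}
  + \left(\frac{\partial w^{*}}{\partial \lambda}\right)^{\!\top}
    \frac{\partial \mathcal{L}_{\mathrm{val}}}{\partial w}.
```

The second expression is the hypergradient obtained by the chain rule. The term $\partial w^{*}/\partial \lambda$ is the expensive part, because it requires knowing how the trained weights respond to the hyperparameters, which is why methods in this area approximate the best-response function.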
Aspect Sentiment Triplet Extraction (ASTE) has become an emerging task in sentiment analysis research, aiming to extract triplets of the aspect term, its corresponding opinion term, and its associated sentiment polarity from a given sentence. Recently, many neural-network-based models with different tagging schemes have been proposed, but almost all of them have limitations: they rely heavily on 1) the prior assumption that each word is associated with only a single role (e.g., aspect term or opinion term) and 2) word-level interactions, treating each opinion/aspect as a set of independent words. As a result, they perform poorly on complex ASTE cases, such as a word associated with multiple roles or an aspect/opinion term with multiple words. We therefore propose a novel approach, Span TAgging and Greedy infErence (STAGE), to extract sentiment triplets at the span level, where each span may consist of multiple words and play different roles simultaneously. To this end, this paper formulates the ASTE task as a multi-class span classification problem. Specifically, STAGE generates more accurate aspect sentiment triplet extractions by exploring span-level information and constraints, and consists of two components: a span tagging scheme and a greedy inference strategy. The former tags all possible candidate spans based on a newly defined tagging set. The latter retrieves the aspect/opinion term with the maximum length from the candidate sentiment snippet to output sentiment triplets. Furthermore, we propose a simple but effective model based on STAGE, which outperforms the state of the art by a large margin on four widely used datasets. Moreover, STAGE can be easily generalized to other pair/triplet extraction tasks, which also demonstrates the superiority of the proposed scheme.
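To make the span-level view concrete, the sketch below enumerates every candidate span of a tokenized sentence and picks the longest span from a set of candidates. It is only an illustration under assumptions: the abstract does not define STAGE's tagging set, its classifier, or what exactly a candidate sentiment snippet is, so the function names `enumerate_spans` and `longest_span`, the `max_len` cap, and the example sentence are all hypothetical.

```python
def enumerate_spans(tokens, max_len):
    """Return every contiguous span of at most max_len tokens.

    Each span is (start, end, text) with inclusive token indices.
    """
    spans = []
    for start in range(len(tokens)):
        last = min(start + max_len, len(tokens))
        for end in range(start, last):
            spans.append((start, end, " ".join(tokens[start:end + 1])))
    return spans


def longest_span(candidates):
    """Greedy choice (assumed): keep the candidate covering the most tokens."""
    return max(candidates, key=lambda span: span[1] - span[0] + 1)


tokens = ["The", "battery", "life", "is", "great"]
spans = enumerate_spans(tokens, max_len=2)

# Hypothetical output of a span classifier: two overlapping aspect candidates.
aspect_candidates = [(1, 1, "battery"), (1, 2, "battery life")]
print(longest_span(aspect_candidates))
```

Treating "battery life" as a single unit, rather than as two independently tagged words, is the kind of multi-word term that the word-level schemes criticized above handle poorly.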
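In the history of first-order algorithms, Nesterov's accelerated gradient descent (NAG) is one of the milestones. However, the cause of the acceleration has long been a mystery. The role of the gradient correction was not revealed until the high-resolution differential equation framework was proposed in [Shi et al., 2021]. In this paper, we continue to investigate the acceleration phenomenon. First, we provide a significantly simplified proof based on a precise observation and an inequality for $L$-smooth functions. Then, a new implicit-velocity high-resolution differential equation framework, together with the corresponding implicit-velocity versions of the phase-space representation and the Lyapunov function, is proposed to investigate the convergence behavior of the iterative sequence $\{x_k\}_{k=0}^{\infty}$ of NAG. Furthermore, from the two kinds of phase-space representations, we find that the role played by the gradient correction is equivalent to that of the velocity included implicitly in the gradient, where the only difference is that the iterative sequence $\{y_k\}_{k=0}^{\infty}$ is replaced by $\{x_k\}_{k=0}^{\infty}$. Finally, for the open question of whether the gradient norm minimization of NAG has a faster rate $o(1/k^3)$, we give a positive answer with its proof. Meanwhile, a faster rate of objective value minimization, $o(1/k^2)$, is shown for the case $r > 2$.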
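Benefiting from its search efficiency, differentiable neural architecture search (NAS) has evolved into the most dominant alternative for automatically designing competitive deep neural networks (DNNs). We note that DNNs must be executed under strict performance constraints in real-world scenarios, for example, the runtime latency on autonomous vehicles. However, to obtain an architecture that meets a given performance constraint, previous hardware-aware differentiable NAS methods have to repeat many search runs, manually tuning the hyperparameters by trial and error, so the total design cost increases proportionally. To resolve this, we introduce a lightweight hardware-aware differentiable NAS framework dubbed LightNAS, which strives to find the required architecture satisfying various performance constraints through a one-time search (i.e., \underline{\textit{you only search once}}). Extensive experiments are conducted to show the superiority of LightNAS over previous state-of-the-art methods.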
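This work presents a mitosis detection method that uses only a single vanilla convolutional neural network (CNN). Our approach consists of two steps: given an image, we first apply the CNN with a sliding-window technique to extract patches that contain mitoses. We then compute the class activation map of each extracted patch to obtain the precise location of the mitosis. To improve the generalizability of the model, we train the CNN with a series of data augmentation techniques, a loss that copes with noisily labeled images, and an active learning strategy. Our approach achieved an F1 score of 0.7323 with an EfficientNet-b3 model in the preliminary test phase of the MIDOG 2022 challenge.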
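The two steps can be sketched in plain Python, under assumptions. The patch size, stride, and toy feature maps below are hypothetical, and the CNN itself is not shown. The class activation map is computed in the usual way, as a weighted sum of the last convolutional feature maps using the classifier weights of the target class, and the abstract does not state whether the authors use exactly this variant.

```python
def sliding_windows(height, width, patch, stride):
    """Yield (top, left) corners of patch x patch windows that fit inside the image."""
    for top in range(0, height - patch + 1, stride):
        for left in range(0, width - patch + 1, stride):
            yield top, left


def class_activation_map(feature_maps, class_weights):
    """CAM[i][j] = sum_k class_weights[k] * feature_maps[k][i][j]."""
    rows, cols = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * cols for _ in range(rows)]
    for fmap, weight in zip(feature_maps, class_weights):
        for i in range(rows):
            for j in range(cols):
                cam[i][j] += weight * fmap[i][j]
    return cam


def peak_location(cam):
    """Return the (row, col) of the strongest activation in the map."""
    cells = [(i, j) for i in range(len(cam)) for j in range(len(cam[0]))]
    return max(cells, key=lambda ij: cam[ij[0]][ij[1]])


# Step 1: window corners for a hypothetical 4 x 6 image, 2 x 2 patches, stride 2.
corners = list(sliding_windows(4, 6, patch=2, stride=2))

# Step 2: toy feature maps and class weights standing in for the CNN's outputs.
feature_maps = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
class_weights = [0.5, 1.0]
cam = class_activation_map(feature_maps, class_weights)
print(corners, cam, peak_location(cam))
```

In a full pipeline the peak would be mapped back to image coordinates by adding the window's corner and rescaling from feature-map resolution to patch resolution.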